The Making of the Royal Society Corpus
Abstract
The Royal Society Corpus is a corpus of Early and Late Modern English, built in an agile process and covering publications of the Royal Society of London from 1665 to 1869 (Kermes et al., 2016), with a size of approximately 30 million words. In this paper we provide details on two aspects of the building process: the mining of patterns for OCR correction, and the improvement and evaluation of part-of-speech tagging.
Similar resources
An Investigation of the Generic Features of Research Articles Published in the Bulletin of Iranian Mathematical Society
In light of the understanding that the analysis of the generic features of different academic genres can enhance the ability of non-native members of academic discourse communities to understand, and where appropriate, to produce them, the present study aimed at investigating the dominant generic structure of research articles in mathematics. To start with a relatively narrow focus, a corpus of...
Painting and Society The Formation of the Persian Painting in the 14th Century
Persian painting has usually been studied from a historical point of view, but its formation is rooted in a specific social context. In this study, we try to contextualize it and show that this social context plays a crucial role in its aesthetics. Persian painting is an art of royal courts, and it represents the life of princes combined with Persian epic legends. This social co...
The Royal Society Corpus: From Uncharted Data to Corpus
We present the Royal Society Corpus (RSC) built from the Philosophical Transactions and Proceedings of the Royal Society of London. At present, the corpus contains articles from the first two centuries of the journal (1665–1869) and amounts to around 35 million tokens. The motivation for building the RSC is to investigate the diachronic linguistic development of scientific English. Specifically...
Treatment of hypospadias.
Release of the corpus is not so much obtained by the resection of tissue (chordee) as by dissection of the corpora cavernosa. It is important that this dissection should be completed. Too often, I have operated on patients with a so-called recurrence of chordee. My personal view is that no such process exists. A corpus cavernosum that is well released and has received an adequate skin cover sho...
Developing a Corpus-Based Word List in Pharmacy Research Articles: A Focus on Academic Culture
The present corpus-based lexical study reports the development of a Pharmacy Academic Word List (PAWL); a list of the most frequent words from a corpus of 3,458,445 tokens made up of 800 most recent pharmacy texts including research articles, review articles, and short communications in four sub-disciplines of pharmacy. WordSmith (Scott, 2017) and AntWordProfiler (Anthony, 2014) were used to sc...
Publication date: 2017